Welcome, friends!

Please go forth and scroll down to begin your journey though my data re-analysis assignment for AN 597. Here I have provided some basic background and performed/replicated all analyses in a submitted manuscript, Warkentin et al. 201-tbd. The original paper and full dataset used are uploaded onto the repository in pdf and csv form, respectively.

Introduction

Red-eyed treefrogs, Agalychnis callidryas (pictured above as an adult and a juvenile), commonly lay their eggs (pictured below left) on leafy substrate overhanging tropical ponds .

These arboreal embryos can hatch prematurely to escape from egg predators, such as the parrot snake (pictured above right), cued by vibrations in attacks.

It is known that young embryos modulate hatching based on multiple frequency and temporal properties of cues, reducing false alarms that unnecessarily expose them to risk in the water.

Hypothesis

Because the cost of false alarms decreases developmentally we hypothesized that, if sampling costs are high or stimuli ambiguous, older embryos would accept more false alarms.

Methods

Experimental design

We assessed changes in sensitivity to sampling costs using vibration playbacks (to trays of embryos, pictured above) at two developmental stages. We designed sets of 3 stimuli, based on prior results with younger embryos, so one elicited high hatching and two elicited similarly low hatching, but sampling costs differed between low-hatching stimuli.

Statistics

The original paper and full dataset used are uploaded onto the repository in pdf and csv form, respectively. I will replicate all analyses included in the paper, including:
* mann-whitney-wilcoxon tests
* two-sample t-tests
* wilcoxon rank sum tests
* binomial GLMs
* interaction plots
* ANOVAs
* AIC model comparisons

Code preparation

File logistics

First, we will clear the environment and set our working directory:

rm(list=ls()) #clear environment
setwd('/Users/juliejung/Documents/GitHub/AN 597/data-reanalysis-assignment') #set working directory       

User Defined Functions

Here we’ll define some functions that will help us later. The following function will give us the mode:

# gives mode
Mode <- function(x) {
  ux <- unique(x)
  ux[which.max(tabulate(match(x, ux)))]
}

Load packages

Here, we install and load packages that we’ll need later on in the code. NOTE: uncomment as needed.

# install.packages("stargazer")
# install.packages("knitr")
# install.packages("dplyr")
# install.packages("curl")
# install.packages("sciplot")
# install.packages("ggplot2")
# install.packages("MASS")
# install.packages("multcomp")
# install.packages("AICcmodavg")
# install.packages("car")

library("stargazer")
## 
## Please cite as:
##  Hlavac, Marek (2014). stargazer: LaTeX code and ASCII text for well-formatted regression and summary statistics tables.
##  R package version 5.1. http://CRAN.R-project.org/package=stargazer
library("knitr")
library("dplyr")
## 
## Attaching package: 'dplyr'
## The following object is masked from 'package:stats':
## 
##     filter
## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library("curl")
## Warning: package 'curl' was built under R version 3.1.3
library("sciplot")
library("ggplot2")
## Warning: package 'ggplot2' was built under R version 3.1.3
library("MASS")
## Warning: package 'MASS' was built under R version 3.1.3
## 
## Attaching package: 'MASS'
## The following object is masked from 'package:dplyr':
## 
##     select
library("multcomp")
## Warning: package 'multcomp' was built under R version 3.1.3
## Loading required package: mvtnorm
## Loading required package: survival
## Loading required package: splines
## Loading required package: TH.data
library("AICcmodavg")
library("car")
## Warning: package 'car' was built under R version 3.1.3

Read in data

Next, we want to read in the relevant datafile (from the data-reanalysis-assignment repo) and show a few lines of raw data in your output (e.g., using head()).

f <- curl("https://raw.githubusercontent.com/jamjulie/data-reanalysis-assignment/master/WarkentinJungRuedaMcDaniel-DataForDeposit.csv")
data <- read.csv(f, header = TRUE, sep = ",", stringsAsFactors = TRUE)
head(data)
##       Date Age.d. Set TestOrder STIMULUS SetupTime.h.  Tray
## 1 20151013    5.2   1         1       LS         3.02 390-2
## 2 20151013    5.2   1         2       LF         3.35   390
## 3 20151013    5.2   1         3       HF         3.75   370
## 4 20151013    5.2   2         4       HF         4.08 388-2
## 5 20151013    5.2   2         5       LS         4.45   388
## 6 20151013    5.2   2         6       LF         4.82 383-2
##   AlreadyHatchedorRuptured HatchedinSetup LessDeveloped TestEggs
## 1                        1              1             0       13
## 2                        3              0             0       12
## 3                        2              0             0       13
## 4                        1              2             0       12
## 5                        0              0             0       15
## 6                        3              0             1       11
##   FirstHatch.s. FirstHatch.s.600 CycleLength.s. FirstHatch.cycles.
## 1            90               90             17           5.294118
## 2            79               79              2          39.500000
## 3            NA              600              2                 NA
## 4            NA              600              2                 NA
## 5            67               67             17           3.941176
## 6            24               24              2          12.000000
##   FirstHatch.cycles.600 X5minHatch X10minHatch ProportionHatched
## 1              5.294118          2           2             0.154
## 2             39.500000          2           2             0.167
## 3            300.000000          0           0             0.000
## 4            300.000000          0           0             0.000
## 5              3.941176          3           3             0.200
## 6             12.000000          1           2             0.182
##   GutCoilStage T1length T2length T3length MeanLength
## 1           NA   10.793    8.124   11.124     10.014
## 2           NA   10.514   10.698   11.134     10.782
## 3           NA   10.489   10.527   11.000     10.672
## 4            0   10.244   10.768   10.644     10.552
## 5            0   10.655   10.346   11.501     10.834
## 6            1   10.856   11.158   10.323     10.779

Results, part I

Now that we’ve gone through all our preparations, we can run analyses and report replicated results! The following results follow the order of presented results in the paper (pdf in repo).

First, we will check there are at least 8 eggs per tray.

min(data$TestEggs, na.rm=T) 
## [1] 8

Now let’s look at the structure of our data, and make sure the variables are defined as we want them to be defined.

str(data)
## 'data.frame':    81 obs. of  24 variables:
##  $ Date                    : int  20151013 20151013 20151013 20151013 20151013 20151013 20151013 20151013 20151013 20151013 ...
##  $ Age.d.                  : num  5.2 5.2 5.2 5.2 5.2 5.2 5.2 5.2 5.2 5.2 ...
##  $ Set                     : int  1 1 1 2 2 2 3 3 3 4 ...
##  $ TestOrder               : int  1 2 3 4 5 6 7 8 9 10 ...
##  $ STIMULUS                : Factor w/ 3 levels "HF","LF","LS": 3 2 1 1 3 2 3 1 2 1 ...
##  $ SetupTime.h.            : num  3.02 3.35 3.75 4.08 4.45 4.82 5.27 5.77 6.08 6.52 ...
##  $ Tray                    : Factor w/ 81 levels "370","371-2",..: 27 26 1 24 23 18 17 25 22 19 ...
##  $ AlreadyHatchedorRuptured: int  1 3 2 1 0 3 1 1 0 0 ...
##  $ HatchedinSetup          : int  1 0 0 2 0 0 0 0 0 1 ...
##  $ LessDeveloped           : int  0 0 0 0 0 1 0 0 0 0 ...
##  $ TestEggs                : int  13 12 13 12 15 11 14 14 15 14 ...
##  $ FirstHatch.s.           : int  90 79 NA NA 67 24 NA 58 34 NA ...
##  $ FirstHatch.s.600        : int  90 79 600 600 67 24 600 58 34 600 ...
##  $ CycleLength.s.          : int  17 2 2 2 17 2 17 2 2 2 ...
##  $ FirstHatch.cycles.      : num  5.29 39.5 NA NA 3.94 ...
##  $ FirstHatch.cycles.600   : num  5.29 39.5 300 300 3.94 ...
##  $ X5minHatch              : int  2 2 0 0 3 1 0 2 5 0 ...
##  $ X10minHatch             : int  2 2 0 0 3 2 1 2 5 0 ...
##  $ ProportionHatched       : num  0.154 0.167 0 0 0.2 0.182 0.071 0.143 0.333 0 ...
##  $ GutCoilStage            : int  NA NA NA 0 0 1 1 0 0 1 ...
##  $ T1length                : num  10.8 10.5 10.5 10.2 10.7 ...
##  $ T2length                : num  8.12 10.7 10.53 10.77 10.35 ...
##  $ T3length                : Factor w/ 80 levels "10.323","10.368",..: 21 22 16 7 41 1 10 13 2 20 ...
##  $ MeanLength              : num  10 10.8 10.7 10.6 10.8 ...
data$GutCoilStage<-as.numeric(as.character(data$GutCoilStage))
data$T3length<-as.numeric(as.character(data$T3length))
data$ProportionHatched<-as.numeric(as.character(data$ProportionHatched))
data$STIMULUS<-as.factor(data$STIMULUS)
data$FirstHatch.s. <-as.numeric(as.character(data$FirstHatch.s. ))
data$FirstHatch.cycles. <-as.numeric(as.character(data$FirstHatch.cycles. ))

Here let’s subset the data into 2 age groups: younger and older.

younger<-subset(data, Age.d.==5.2, na.rm=T)
older<-subset(data, Age.d.==5.7, na.rm=T)

Since the paper reports the mode of stages for each age category, let’s find that:

Mode(younger$GutCoilStage)
## [1] 1
Mode(older$GutCoilStage)
## [1] 3

Younger embryos showed a range of stages from intact yolks to curved furrows while older embryos had straight furrows to full coils (N = 39 and 42; modes: straight furrow(1) and full coil(3)).

Summarize the data

Here we’ll create a table summary of statistics for hatching rates (in proportion hatched) for each age category, grouped by stimulus treatment. We use the {dplyr} package to summarize our data.

younger_errorstats_g8<-
  younger %>%
  group_by(STIMULUS) %>%
  summarize(count = n(),
            mean = mean(ProportionHatched),
            SD = sd(ProportionHatched), 
            SE = sd(ProportionHatched)/sqrt(n())
            )
younger_errorstats_g8$AgeGroup <- 5.2
kable(younger_errorstats_g8,title="Mean & SD & SE",digits=3)
STIMULUS count mean SD SE AgeGroup
HF 13 0.061 0.056 0.016 5.2
LF 13 0.376 0.307 0.085 5.2
LS 13 0.099 0.085 0.024 5.2
older_errorstats_g8<-
  older %>%
  group_by(STIMULUS) %>%
  summarize(count = n(),
            mean = mean(ProportionHatched),
            SD = sd(ProportionHatched), 
            SE = sd(ProportionHatched)/sqrt(n())
            )
older_errorstats_g8$AgeGroup <- 5.7
kable(older_errorstats_g8,title="Mean & SD & SE",digits=3)
STIMULUS count mean SD SE AgeGroup
HF 14 0.211 0.211 0.056 5.7
LF 14 0.669 0.232 0.062 5.7
LS 14 0.582 0.371 0.099 5.7

Should we use parametric or non-parametric tests?

hist(data$GutCoilStage) #non parametric

hist(younger$GutCoilStage)

hist(older$GutCoilStage)

We’ll use mann-whitney-wilcoxon test because our 2 data samples (younger and older) are independent/come from distinct populations, so the samples do not affect each other.

Also because our data are non-parametric.

wtest<-wilcox.test(data$GutCoilStage~data$Age.d., mu = 0, alt="two.sided", paired = F, conf.int=T, conf.level=0.95)
qnorm(wtest$p.value) #z-value to report
## [1] -6.606223
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  data$GutCoilStage by data$Age.d.
## W = 88.5, p-value = 1.971e-11
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  -1.999996 -1.000003
## sample estimates:
## difference in location 
##              -1.999985

Younger embryos showed a range of stages from intact yolks to curved furrows while older embryos had straight furrows to full coils (N = 39 and 42; modes: straight furrow(1) and full coil(3); Wilcoxon test, Z=6.606223, P =1.97e-11).

Moving right along, here we’ll calculation the total lengths (mean and se) of hatchlings in each age group.

mean(younger$MeanLength)
## [1] 11.06536
mean(older$MeanLength)
## [1] 11.64307
se(younger$MeanLength)
## [1] 0.07123206
se(older$MeanLength)
## [1] 0.0546835

Embryos grew between test ages. Hatchling total length increased from 11.1 ± 0.07 to 11.6 ± 0.05 mm (mean ± SE, N = 39 and 42 trays respectively).

Since the data are normal and we get equal variances, we’ll perform a t-test.

hist(data$MeanLength)#normal

var.test(younger$MeanLength, older$MeanLength) # equal variances
## 
##  F test to compare two variances
## 
## data:  younger$MeanLength and older$MeanLength
## F = 1.5756, num df = 38, denom df = 41, p-value = 0.1554
## alternative hypothesis: true ratio of variances is not equal to 1
## 95 percent confidence interval:
##  0.8396433 2.9795488
## sample estimates:
## ratio of variances 
##           1.575628
t.test(younger$MeanLength, older$MeanLength, alternative=c("two.sided"), paired=FALSE, var.equal=TRUE, conf.level=0.95)
## 
##  Two Sample t-test
## 
## data:  younger$MeanLength and older$MeanLength
## t = -6.4874, df = 79, p-value = 7.01e-09
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.7549655 -0.4004594
## sample estimates:
## mean of x mean of y 
##  11.06536  11.64307

Embryos grew between test ages. Hatchling total length increased from 11.1 ± 0.07 to 11.6 ± 0.05 mm (mean ± SE, N = 39 and 42 trays respectively; t-test, t79 = 6.4871, P =7.018e-9; Fig. 4B).

Embryos also hatched spontaneously between test ages, so we want to calculate some means and SEs for spontaneous hatching before the test period.

mean(younger$AlreadyHatchedorRuptured)
## [1] 0.5128205
se(younger$AlreadyHatchedorRuptured)
## [1] 0.1317985
mean(older$AlreadyHatchedorRuptured)
## [1] 0.9047619
se(older$AlreadyHatchedorRuptured)
## [1] 0.1219733

From trays tested, more of the older embryos had hatched spontaneously before the test period (0.9 ± 0.1 vs. 0.5 ± 0.1 embryos per tray).

Since the data are non-parametric, we’ll perform a mann-whitney test:

hist(data$AlreadyHatchedorRuptured)#non-parametric, mann-whitney

wtest<-wilcox.test(data$AlreadyHatchedorRuptured~data$Age.d., mu = 0, alt="two.sided", paired = F, conf.int=T, conf.level=0.95)
## Warning in wilcox.test.default(x = c(1L, 3L, 2L, 1L, 0L, 3L, 1L, 1L, 0L, :
## cannot compute exact p-value with ties
## Warning in wilcox.test.default(x = c(1L, 3L, 2L, 1L, 0L, 3L, 1L, 1L, 0L, :
## cannot compute exact confidence intervals with ties
qnorm(wtest$p.value) #z-value to report
## [1] -2.439028
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  data$AlreadyHatchedorRuptured by data$Age.d.
## W = 559.5, p-value = 0.007363
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  -9.999772e-01 -6.210509e-05
## sample estimates:
## difference in location 
##          -2.881878e-05

From trays tested, more of the older embryos had hatched spontaneously before the test period (0.9 ± 0.1 vs. 0.5 ± 0.1 embryos per tray; Wilcoxon test, Z=2.439, P = 0.0074) and relatively fewer of the trays with older eggs had sufficient individuals to attempt setup (KMW unquantified personal observation).

Now we’ll look at the number of embryos that hatched at set up:

# hatching from set up
mean(younger$HatchedinSetup)
## [1] 0.8974359
se(younger$HatchedinSetup)
## [1] 0.2317418
mean(older$HatchedinSetup)
## [1] 2.452381
se(older$HatchedinSetup)
## [1] 0.2974971

More of the older embryos hatched during setup and acclimation (2.5 ± 0.3 vs. 0.9 ± 0.2 per tray).

Since the data are non-parametric, we’ll perform a mann-whitney test:

hist(data$HatchedinSetup)#non-parametric, mann-whitney

wtest<-wilcox.test(data$HatchedinSetup~data$Age.d., mu = 0, alt="two.sided", paired = F, conf.int=T, conf.level=0.95)
## Warning in wilcox.test.default(x = c(1L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L, :
## cannot compute exact p-value with ties
## Warning in wilcox.test.default(x = c(1L, 0L, 0L, 2L, 0L, 0L, 0L, 0L, 0L, :
## cannot compute exact confidence intervals with ties
qnorm(wtest$p.value) #z-value to report
## [1] -3.886517
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  data$HatchedinSetup by data$Age.d.
## W = 404.5, p-value = 5.085e-05
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  -2.0000029 -0.9999489
## sample estimates:
## difference in location 
##              -1.000005

More of the older embryos hatched during setup and acclimation (2.5 ± 0.3 vs. 0.9 ± 0.2 per tray; Wilcoxon test, Z=404.5, P = 5.085e-5).

We want to calculate some means and ses for the number of test eggs per tray in older vs. younger.

#smaller number of test eggs per tray in older than younger
mean(younger$TestEggs)
## [1] 13.46154
se(younger$TestEggs)
## [1] 0.2404622
mean(older$TestEggs)
## [1] 11.45238
se(older$TestEggs)
## [1] 0.3089873
min(younger$TestEggs)
## [1] 8
max(younger$TestEggs)
## [1] 15
min(older$TestEggs)
## [1] 8
max(older$TestEggs)
## [1] 15

This resulted in smaller numbers of test eggs per tray (older 11.5 ± 0.3, younger 13.5 ± 0.2, range 8–15 at both ages).

Since the data are non-parametric, we’ll perform a mann-whitney test:

hist(data$TestEggs)#non-parametric, mann-whitney

wtest<-wilcox.test(data$TestEggs~data$Age.d., mu = 0, alt="two.sided", paired = F, conf.int=T, conf.level=0.95)
## Warning in wilcox.test.default(x = c(13L, 12L, 13L, 12L, 15L, 11L, 14L, :
## cannot compute exact p-value with ties
## Warning in wilcox.test.default(x = c(13L, 12L, 13L, 12L, 15L, 11L, 14L, :
## cannot compute exact confidence intervals with ties
qnorm(wtest$p.value) #z-value to report
## [1] -4.361872
wtest
## 
##  Wilcoxon rank sum test with continuity correction
## 
## data:  data$TestEggs by data$Age.d.
## W = 1289.5, p-value = 6.448e-06
## alternative hypothesis: true location shift is not equal to 0
## 95 percent confidence interval:
##  1.000044 2.999979
## sample estimates:
## difference in location 
##               2.000011

This resulted in smaller numbers of test eggs per tray (Wilcoxon test, Z=4.361872, P =6.448e-6; older 11.5 ± 0.3, younger 13.5 ± 0.2, range 8–15 at both ages).

Results, part II: Visualizations

Next, we want to know whether age, stimulus, and/or their interaction affected the hatching response of embryos in playbacks

We’ll do this by performing a binomial GLM:

#binomial glm
glm1<-glm(cbind(X10minHatch,TestEggs)~Age.d., family=binomial(logit), data=data)
glm2<-glm(cbind(X10minHatch,TestEggs)~STIMULUS, family=binomial(logit), data=data)
glm3<-glm(cbind(X10minHatch,TestEggs)~Age.d.+STIMULUS, family=binomial(logit), data=data)
glm4<-glm(cbind(X10minHatch,TestEggs)~Age.d.*STIMULUS, family=binomial(logit), data=data)

glms<-list(glm1, glm2, glm3, glm4)
aictab(glms)
## Warning in aictab.AICglm.lm(glms): 
## Model names have been supplied automatically in the table
## 
## Model selection based on AICc :
## 
##      K   AICc Delta_AICc AICcWt Cum.Wt      LL
## Mod4 6 329.47       0.00   0.97   0.97 -158.17
## Mod3 4 336.80       7.33   0.03   1.00 -164.14
## Mod1 2 390.29      60.82   0.00   1.00 -193.07
## Mod2 3 394.16      64.69   0.00   1.00 -193.93
Anova(glm4)
## Analysis of Deviance Table (Type II tests)
## 
## Response: cbind(X10minHatch, TestEggs)
##                 LR Chisq Df Pr(>Chisq)    
## Age.d.            59.578  1  1.175e-14 ***
## STIMULUS          57.863  2  2.724e-13 ***
## Age.d.:STIMULUS   11.934  2   0.002562 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Age, stimulus, and their interaction affected the hatching response of embryos in playbacks (Fig. 6A). Older embryos hatched more than did younger ones (binomial GLM, age effect: χ2 = 107.71, df = 1, P < 0.0001), and the LF stimulus elicited the strongest hatching response (stimulus effect: χ2 = 57.57, df = 2, P < 0.0001). However, the significant age × stimulus interaction revealed that the developmental increase in hatching was not uniform across stimuli (interaction effect: χ2 = 12.55, df = 2, P = 0.0019).

This is the plot for our proportion hatched data (Figure 6 in paper)

interaction.plot(data$Age.d., data$STIMULUS, data$ProportionHatched, 
                 leg.bty="0", pch=c(18,24), 
                 ylab="Proportion hatched", 
                 xlab="Age (days)", 
                 trace.label="")

The image of the figure from the original paper that I’m replicating is included in a folder called “img” within my repo and embedded here:

Specifically, we’re looking at part A of the figure.

ALTERNATIVELY, we can use ggplot to create a much prettier plot, with standard errors.

younger_errorstats_g8$AgeGroup<-as.factor(younger_errorstats_g8$AgeGroup)
older_errorstats_g8$AgeGroup<-as.factor(older_errorstats_g8$AgeGroup)
combined_errorstats<- rbind(younger_errorstats_g8, older_errorstats_g8)

The following code creates our figure:

ggplot(data=combined_errorstats, aes(x=AgeGroup, y=mean, colour=STIMULUS)) + 
  geom_point(size=3) +
  geom_errorbar(data=combined_errorstats, aes(ymin=mean-SE, ymax=mean+SE), width=0.05)+
  theme_bw(20)+
  ylab("Proportion of tray hatched\n")+
  xlab("\n Age (d)")+
  annotate("text", x = 1, y = 0.2, label = "N=13")+
  annotate("text", x = 2, y = 0.4, label = "N=14")

Moving on: Here we perform separate binomial GLMs at each age, with grouped categories of treatment stimuli.

#binomial glm for YOUNGER
glm2<-glm(cbind(X10minHatch,TestEggs)~STIMULUS, family=binomial(logit), data=younger)
Anova(glm2)
## Analysis of Deviance Table (Type II tests)
## 
## Response: cbind(X10minHatch, TestEggs)
##          LR Chisq Df Pr(>Chisq)    
## STIMULUS   40.133  2  1.929e-09 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#binomial glm for OLDER
glm2<-glm(cbind(X10minHatch,TestEggs)~STIMULUS, family=binomial(logit), data=older)
Anova(glm2)
## Analysis of Deviance Table (Type II tests)
## 
## Response: cbind(X10minHatch, TestEggs)
##          LR Chisq Df Pr(>Chisq)    
## STIMULUS   29.664  2  3.618e-07 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Younger embryos showed equally little response to both the HF and LS stimuli but a substantial hatching response to the LF stimulus (Fig. 6A, stimulus effect: χ2 = 57.57, df = 2, P < 0.0001; HF–LS contrast χ2 = 1.48, P = 0.22; (HF+LS)–LF contrast χ2 = 370 56.75, P < 0.0001). In contrast, older embryos showed similarly strong hatching responses to both LF and LS stimuli and a weaker response to the HF stimulus (stimulus effect: χ2 = 73.84, df = 2, P < 0.0001; LF–LS contrast χ2 = 2.46, P = 0.12; (LF+LS)–HF contrast χ2 = 71.61, P < 0.0001).

Results, part III: Latency Analysis

Because risk in predator attacks probably accrues as a function of time, but information from temporal properties accrues as a function of cycles we conducted analyses of latency measured both in time (seconds) and in cycles (dividing time by the cycle length of the stimulus).

min(data$FirstHatch.s., na.rm=T)
## [1] 9
max(data$FirstHatch.s., na.rm=T)
## [1] 420

In trays where hatching occurred, the latency until the first embryo hatched ranged from 9–420 s.

Are our data parametric?

hist(data$FirstHatch.s.) #nonparametric

hist(data$FirstHatch.cycles.) #nonparametric

hist(log(data$FirstHatch.s.)) #parametric!

hist(log(data$FirstHatch.cycles.)) #parametric!

Since they are, we can use ANOVAs of log-transformed data to test for effects of age, stimulus and age-by-stimulus interaction on the latency to hatch.

# IN SECONDS, Anova
aov1 <- aov(log(FirstHatch.s.) ~ Age.d.*STIMULUS, data=data)
summary(aov1)
##                 Df Sum Sq Mean Sq F value   Pr(>F)    
## Age.d.           1 15.153  15.153   39.52 4.57e-08 ***
## STIMULUS         2 10.518   5.259   13.72 1.33e-05 ***
## Age.d.:STIMULUS  2  2.232   1.116    2.91   0.0625 .  
## Residuals       58 22.238   0.383                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 17 observations deleted due to missingness
# IN CYCLES, Anova
aov2 <- aov(log(FirstHatch.cycles.) ~ Age.d.*STIMULUS, data=data)
summary(aov2)
##                 Df Sum Sq Mean Sq F value   Pr(>F)    
## Age.d.           1  18.99  18.993   49.54 2.53e-09 ***
## STIMULUS         2  55.13  27.565   71.89  < 2e-16 ***
## Age.d.:STIMULUS  2   2.23   1.116    2.91   0.0625 .  
## Residuals       58  22.24   0.383                     
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 17 observations deleted due to missingness

Latency varied among stimuli and decreased with age, with a marginally non-significant interaction effect (Fig. 6B,C). This pattern held whether measuring latency in seconds or in cycles of vibration (seconds: age, F1,58 = 39.52, P < 0.0001; stimulus, F2,58 = 13.72, P < 0.0001; interaction, F2,58 = 2.91, P = 0.0625; cycles: age, F1,58 = 49.54, P < 0.0001; stimulus, F2,58 = 71.89, P < 0.0001; interaction, F2,58 = 2.91, P = 0.0625).

Alternatively, we can use GLMs:

# IN SECONDS, GLM METHOD
glm1<-glm(FirstHatch.s.~Age.d., family=gaussian(link = "identity"), data=data)
glm2<-glm(FirstHatch.s.~STIMULUS, family=gaussian(link = "identity"), data=data)
glm3<-glm(FirstHatch.s.~Age.d.+STIMULUS, family=gaussian(link = "identity"), data=data)
glm4<-glm(FirstHatch.s.~Age.d.*STIMULUS, family=gaussian(link = "identity"), data=data)

glms<-list(glm1, glm2, glm3, glm4)
aictab(glms)
## Warning in aictab.AICglm.lm(glms): 
## Model names have been supplied automatically in the table
## 
## Model selection based on AICc :
## 
##      K   AICc Delta_AICc AICcWt Cum.Wt      LL
## Mod4 7 705.73       0.00   0.95   0.95 -344.87
## Mod3 5 711.47       5.73   0.05   1.00 -350.22
## Mod1 3 719.37      13.64   0.00   1.00 -356.48
## Mod2 4 729.28      23.55   0.00   1.00 -360.30
Anova(glm4)
## Analysis of Deviance Table (Type II tests)
## 
## Response: FirstHatch.s.
##                 LR Chisq Df Pr(>Chisq)    
## Age.d.            25.396  1  4.669e-07 ***
## STIMULUS          14.836  2  0.0006002 ***
## Age.d.:STIMULUS   10.553  2  0.0051100 ** 
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Here we plot the latencies for the first embryo in each tray to hatch:

LtoH_younger_errorstats_g8<-
  younger %>%
  group_by(STIMULUS) %>%
  summarize(count = n(),
            mean = mean(FirstHatch.s., na.rm=T),
            SD = sd(FirstHatch.s., na.rm=T), 
            SE = sd(FirstHatch.s., na.rm=T)/sqrt(n())
            )
LtoH_younger_errorstats_g8$AgeGroup <- "5.2"
kable(LtoH_younger_errorstats_g8,title="Mean & SD & SE",digits=3)
STIMULUS count mean SD SE AgeGroup
HF 13 166.625 130.271 36.131 5.2
LF 13 39.455 33.554 9.306 5.2
LS 13 111.000 74.883 20.769 5.2
LtoH_older_errorstats_g8<-
  older %>%
  group_by(STIMULUS) %>%
  summarize(count = n(),
            mean = mean(FirstHatch.s., na.rm=T),
            SD = sd(FirstHatch.s., na.rm=T), 
            SE = sd(FirstHatch.s., na.rm=T)/sqrt(n())
            )
LtoH_older_errorstats_g8$AgeGroup <- "5.7"
kable(LtoH_older_errorstats_g8,title="Mean & SD & SE",digits=3)
STIMULUS count mean SD SE AgeGroup
HF 14 34.000 22.276 5.954 5.7
LF 14 18.643 7.023 1.877 5.7
LS 14 32.615 20.642 5.517 5.7

Now we can make the plot!

LtoH_combined_errorstats<- rbind(LtoH_younger_errorstats_g8, LtoH_older_errorstats_g8)

The following code creates our figure:

ggplot(data=LtoH_combined_errorstats, aes(x=AgeGroup, y=mean, colour=STIMULUS)) + 
  geom_point(size=3) +
  geom_errorbar(data=LtoH_combined_errorstats, aes(ymin=mean-SE, ymax=mean+SE), width=0.05)+
  theme_bw(20)+
  ylab("Latency to hatch (seconds)\n")+
  xlab("\n Age (d)")

The image of the figure from the original paper that I’m replicating is included in a folder called “img” within my repo and embedded here:

Specifically, we’ve replotted part B of this figure.

Recall that because risk in predator attacks probably accrues as a function of time, but information from temporal properties accrues as a function of cycles, we conducted analyses of latency measured both in time (seconds) and in cycles (dividing time by the cycle length of the stimulus).

## IN CYCLES, GLM METHOD
glm1<-glm(FirstHatch.cycles.~Age.d., family=gaussian(link = "identity"), data=data)
glm2<-glm(FirstHatch.cycles.~STIMULUS, family=gaussian(link = "identity"), data=data)
glm3<-glm(FirstHatch.cycles.~Age.d.+STIMULUS, family=gaussian(link = "identity"), data=data)
glm4<-glm(FirstHatch.cycles.~Age.d.*STIMULUS, family=gaussian(link = "identity"), data=data)

glms<-list(glm1, glm2, glm3, glm4)
aictab(glms)
## Warning in aictab.AICglm.lm(glms): 
## Model names have been supplied automatically in the table
## 
## Model selection based on AICc :
## 
##      K   AICc Delta_AICc AICcWt Cum.Wt      LL
## Mod4 7 599.13       0.00      1      1 -291.57
## Mod3 5 612.07      12.93      0      1 -300.52
## Mod2 4 621.83      22.70      0      1 -306.58
## Mod1 3 629.07      29.93      0      1 -311.33
Anova(glm4)
## Analysis of Deviance Table (Type II tests)
## 
## Response: FirstHatch.cycles.
##                 LR Chisq Df Pr(>Chisq)    
## Age.d.            15.999  1  6.337e-05 ***
## STIMULUS          30.858  2  1.992e-07 ***
## Age.d.:STIMULUS   18.715  2  8.630e-05 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

This pattern held whether measuring latency in seconds or in cycles of vibration (seconds: age, F1,58 = 44.07, P < 0.0001; stimulus, F2,58 = 13.88, P < 0.0001; interaction, F2,58 = 2.91, P = 0.0625; cycles: age, F1,58 = 44.05, P < 0.0001; stimulus, F2,58 = 31.78, P < 0.0001; interaction, F2,58 = 2.91, P = 0.0625).

Here we plot the latencies for the first embryo in each tray to hatch IN CYCLES:

In order to do this, we’ll get the SEs using our summarySE function.

cycles_younger_errorstats_g8<-
  younger %>%
  group_by(STIMULUS) %>%
  summarize(count = n(),
            mean = mean(FirstHatch.cycles., na.rm=T),
            SD = sd(FirstHatch.cycles., na.rm=T), 
            SE = sd(FirstHatch.cycles., na.rm=T)/sqrt(n())
            )
cycles_younger_errorstats_g8$AgeGroup <- "5.2"
kable(cycles_younger_errorstats_g8,title="Mean & SD & SE",digits=3)
STIMULUS count mean SD SE AgeGroup
HF 13 83.312 65.136 18.065 5.2
LF 13 19.727 16.777 4.653 5.2
LS 13 6.529 4.405 1.222 5.2
cycles_older_errorstats_g8<-
  older %>%
  group_by(STIMULUS) %>%
  summarize(count = n(),
            mean = mean(FirstHatch.cycles., na.rm=T),
            SD = sd(FirstHatch.cycles., na.rm=T), 
            SE = sd(FirstHatch.cycles., na.rm=T)/sqrt(n())
            )
cycles_older_errorstats_g8$AgeGroup <- "5.7"
kable(cycles_older_errorstats_g8,title="Mean & SD & SE",digits=3)
STIMULUS count mean SD SE AgeGroup
HF 14 17.000 11.138 2.977 5.7
LF 14 9.321 3.512 0.939 5.7
LS 14 1.919 1.214 0.325 5.7

Now we can make the plot!

cycles_combined_errorstats<- rbind(cycles_younger_errorstats_g8, cycles_older_errorstats_g8)

The following code creates our figure:

ggplot(data=cycles_combined_errorstats, aes(x=AgeGroup, y=mean, colour=STIMULUS)) + 
  geom_point(size=3) +
  geom_errorbar(data=cycles_combined_errorstats, aes(ymin=mean-SE, ymax=mean+SE), width=0.05)+
  theme_bw(20)+
  ylab("Latency to hatch (cycles)\n")+
  xlab("\n Age (d)")

The image of the figure from the original paper that I’m replicating is included in a folder called “img” within my repo and embedded here:

Specifically, we’ve replotted part C of this figure.

HF<-subset(data, STIMULUS=="HF", na.rm=T)
LF<-subset(data, STIMULUS=="LF", na.rm=T)
LS<-subset(data, STIMULUS=="LS", na.rm=T)

HFyounger<-subset(HF, Age.d.==5.2, na.rm=T)
HFolder<-subset(HF, Age.d.==5.7, na.rm=T)
LFyounger<-subset(LF, Age.d.==5.2, na.rm=T)
LFolder<-subset(LF, Age.d.==5.7, na.rm=T)
LSyounger<-subset(LS, Age.d.==5.2, na.rm=T)
LSolder<-subset(LS, Age.d.==5.7, na.rm=T)

mean(LFyounger$FirstHatch.s., na.rm=T)
## [1] 39.45455
se(LFyounger$FirstHatch.s., na.rm=T)
## [1] 10.11692
mean(LFolder$FirstHatch.s., na.rm=T)
## [1] 18.64286
se(LFolder$FirstHatch.s., na.rm=T)
## [1] 1.877007
mean(LSyounger$FirstHatch.cycles., na.rm=T)
## [1] 6.529412
se(LSyounger$FirstHatch.cycles., na.rm=T)
## [1] 1.557356
mean(LSolder$FirstHatch.cycles., na.rm=T)
## [1] 1.918552
se(LSolder$FirstHatch.cycles., na.rm=T)
## [1] 0.3367673

The shortest latency times were for the LF stimulus (younger: 39.5 ± 10.1 s; older 18.6 ± 1.9 s), while latencies for the LS stimulus included the fewest cycles of the vibration pattern (younger: 6.5 ± 1.6; older: 1.9 ± 0.3 cycles).

We conducted analyses of latency both on the subset of trays from which at least one individual hatched and also on the full dataset, assigning a latency of 600 s (i.e., the full playback plus post-playback observation period) to trays in which no embryos hatched.

# IN SECONDS, Anova
aov1 <- aov(FirstHatch.s.600 ~ Age.d.*STIMULUS, data=data)
summary(aov1)
##                 Df  Sum Sq Mean Sq F value  Pr(>F)   
## Age.d.           1  497562  497562  11.521 0.00110 **
## STIMULUS         2  501036  250518   5.801 0.00455 **
## Age.d.:STIMULUS  2   51388   25694   0.595 0.55418   
## Residuals       75 3239146   43189                   
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
# IN SECONDS, GLM METHOD
glm1<-glm(FirstHatch.s.600~Age.d., family=gaussian(link = "identity"), data=data)
glm2<-glm(FirstHatch.s.600~STIMULUS, family=gaussian(link = "identity"), data=data)
glm3<-glm(FirstHatch.s.600~Age.d.+STIMULUS, family=gaussian(link = "identity"), data=data)
glm4<-glm(FirstHatch.s.600~Age.d.*STIMULUS, family=gaussian(link = "identity"), data=data)

glms<-list(glm1, glm2, glm3, glm4)
aictab(glms)
## Warning in aictab.AICglm.lm(glms): 
## Model names have been supplied automatically in the table
## 
## Model selection based on AICc :
## 
##      K    AICc Delta_AICc AICcWt Cum.Wt      LL
## Mod3 5 1100.25       0.00   0.82   0.82 -544.72
## Mod4 7 1103.71       3.46   0.15   0.97 -544.09
## Mod1 3 1107.24       6.99   0.02   0.99 -550.46
## Mod2 4 1109.38       9.13   0.01   1.00 -550.43
Anova(glm4)
## Analysis of Deviance Table (Type II tests)
## 
## Response: FirstHatch.s.600
##                 LR Chisq Df Pr(>Chisq)    
## Age.d.           11.5207  1  0.0006883 ***
## STIMULUS         11.6011  2  0.0030259 ** 
## Age.d.:STIMULUS   1.1899  2  0.5516027    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

If we assign a latency of 600 s to trays in which no embryos hatched, age and stimulus effects remain strong and the interaction effect is much weaker (latency in seconds: age, F1,75 = 27.98, P < 0.0001; stimulus, F2,75 = 7.77, P = 0.0009; interaction F2,75 = 0.75, P = 0.47).

Discussion

Older embryos showed lower latency to hatch, indicating less cue sampling, and more hatching overall. Their similarly high responses to two of the stimuli suggest they ceased to discriminate using slow-to-assess properties as indicators of safety; however, they showed little hatching if either frequency spectrum or a fast temporal pattern allowed rapid assessment of low risk.

Conclusion

Developmental changes in behavior due to ontogenetic adaptation of decision processes are likely to be widespread. Vibration-cued hatching allows us to use the power of playback experiments to improve our understanding of the development of adaptive embryo behavior.